RTP payload format

ABSTRACT

A data stream is encrypted to form encryption units that are packetized into RTP packets. Each RTP packet includes an RTP packet header, one or more payloads of a common data stream, and an RTP payload format header for each payload that includes, for the corresponding encryption units, a boundary for the payload. The payload can be one or more of the encryption units or a fragment of one of the encryption units. The encryption units are reassembled using the payloads in the RTP packets and the respective boundary in the respective RTP payload format header. The reassembled encryption units are decrypted for rendering. Each RTP payload format header can have attributes for the corresponding payload that can be used to render the payload. The RTP packets can be sent server-to-client or peer-to-peer.

TECHNICAL FIELD

The present invention relates to Real-Time Transport Protocol (RTP) and more particularly to an RTP wire format for streaming media (e.g., audio-video) over a network, such as the Internet.

BACKGROUND OF THE INVENTION

The following discussion assumes that the reader is familiar with the IETF RFC 1889 standard, RTP: A Transport Protocol for Real-Time Applications, and with the IETF RFC 1890 standard, RTP Profile for Audio and Video Conferences with Minimal Control.

Real-time transport protocol (RTP), as defined in the RFC 1889 standard, provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data, over multicast or unicast network services. These transport functions provide end-to-end delivery services for data with real-time characteristics, such as interactive audio and video. Such services include payload type identification, sequence numbering, time stamping and delivery monitoring. RTP supports data transfer to multiple destinations using multicast distribution if provided by the underlying network.

The RFC 1889 standard does not provide any mechanism to ensure timely delivery or provide other quality-of-service guarantees, but relies on lower-layer services to do so. It does not guarantee delivery or prevent out-of-order delivery, nor does it assume that the underlying network is reliable and delivers packets in sequence. The sequence numbers included in RTP allow the receiver to reconstruct the sender's packet sequence, but sequence numbers might also be used to determine the proper location of a packet, for example in video decoding, without necessarily decoding packets in sequence.

A typical application of RTP involves streaming data, where packets of Advanced Systems Format (ASF) audio-visual (AV) data are sent in RTP packets over a network from a server to a client or peer-to-peer. The ASF audio and video data can be stored together in one ASF packet. As such, an RTP packet can contain both audio and video data.

RTP, as defined in the RFC 1889 standard, lacks flexibility to group multiple payloads together into a single RTP packet, and to split a payload across multiple RTP packets. Neither does the RFC 1889 standard define a format in which metadata can be delivered with each payload in an RTP packet. Another deficiency of the RFC 1889 standard is the lack of a mechanism for streaming encrypted blocks of data across a network while maintaining a block boundary of each encrypted block such that the recipient thereof can decrypt the encrypted blocks of data. It would be an advance in the art to provide such flexibility as an enhancement to RTP streaming. Consequently, there is a need for improved methods, computer-readable medium, data structures, apparatus, and computing devices that can provide such flexibility.

SUMMARY

In one implementation, packets of Advanced Systems Format (ASF) audio-visual (AV) data are repacketized into Real-Time Transport Protocol (RTP) packets and sent over a network from a server to a client or by peer-to-peer network communications in response to a request to stream the AV data. The AV data is encrypted to form encryption units. The repacketizing process includes packetizing the encryption units into the RTP packets, each of which includes an RTP packet header, one or more payloads of a common data stream, and an RTP payload format (PF) header for each payload. The RTP PF header includes, for the corresponding encryption units, a boundary for the payload. The payload in the RTP packet can be one or more encryption units or a fragment of an encryption unit. After the RTP packets are sent over a network, the encryption units contained in the received RTP packets are reassembled. The reassembly process uses the payloads in the RTP packets and the respective boundary in the respective RTP PF header. The reassembled encryption units can be decrypted for rendering. Each RTP PF header can have attributes for its corresponding payload that can be used to render the payload.

In a variation on the foregoing implementation, data in a format other than ASF is used to form the RTP packets. In a still further variation on the foregoing implementation, the RTP packets are formed so as to contain payloads that are not encrypted.

In yet another implementation, a wire format is provided for streaming encrypted blocks of data protected with Windows® Media Digital Rights Management (WM DRM) across a network in RTP packets (e.g., streaming WM DRM protected content). Each RTP packet contains header data to maintain encryption block boundaries so that each encryption unit can be decrypted by the recipient thereof. Upon decryption using the WM DRM protocol, the streaming data can be rendered by the recipient.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an illustration of an exemplary process, in accordance with an embodiment of the invention, for the transformation of two (2) packets of Advanced Systems Format (ASF) audio-visual (AV) data into four (4) RTP packets, where the audio data and the video data are packetized separately in the resultant RTP packets, and where block boundaries for each payload are preserved such that original AV samples that were encrypted and packetized in the two ASF packets can be reconstructed by a decryption mechanism.

FIG. 2 is an illustration of alternative exemplary processes, in accordance with different embodiments of the invention, for the transformation of two (2) packets of ASF video data into one (1) RTP packet, where one alternative process moves the payloads of the ASF packets into separate payloads in the RTP packet, where the other alternative process combines the payloads of the ASF packets into a combined payload in the RTP packet, and where block boundaries for each payload are preserved such that an original video sample that was encrypted and packetized in the two ASF packets can be reconstructed by a decryption mechanism.

FIGS. 3 a-3 b are respective data structure layouts, in accordance with an embodiment of the present invention, for an RTP header and a corresponding payload header.

FIG. 4 is a block diagram, in accordance with an embodiment of the present invention, of a networked client/server system in which streaming can be performed by server to client or peer to peer.

FIG. 5 is a block diagram, in accordance with an embodiment of the present invention, illustrating communications between a server (or client) and a client, where the server (or client) serves to the client a requested audio-visual data stream that the client can render.

FIG. 6 is a block diagram, in accordance with an embodiment of the present invention, of a networked computer that can be used to implement either a server or a client.

DETAILED DESCRIPTION

Implementations disclosed herein define wire formats for delivery of single and mixed data streams, such as Windows® media data via Real-Time Transport Protocol (RTP). The delivery can be between server and client, as well as in a peer to peer context (e.g., a Windows® Messenger™ audio-visual conference software environment).

A wire format, in various implementations, enhances the IETF RFC 1889 standard to provide greater flexibility for RTP delivery. Implementations provide a mechanism for streaming of audio data in RTP packets that are separate from video data in RTP packets. Implementations also provide a wire format in which metadata can be delivered with each payload in an RTP packet, where the metadata provides rich information that is descriptive of the payload. Still other implementations provide a mechanism for streaming encrypted blocks of data across a network while maintaining a block boundary of each encrypted block such that the recipient thereof can decrypt the encrypted blocks of data. In another implementation, a wire format provides for delivery of data that is protected with Windows® Media Digital Rights Management (WM DRM) such that the delivered data can be decrypted for rendering.

Various implementations disclosed herein repackage data in a series of media packets that are included in a system layer bit stream. These data are packetized into RTP packets consistent with, yet enhancing, the RFC 1889 standard such that the system layer bit stream is mapped to RTP. In this mapping, each media packet contains one or more payloads. In some system layer bit streams, there may be mixed media packets having data such as audio data, video data, program data, JPEG data, HTML data, MIDI data, etc. A mixed media packet is a media packet where two or more of its payloads belong to different media streams.

Various implementations apply to system layer bit streams where each media packet is a single media packet. In a single media packet, all of the payloads in the media packet belong to the same media stream. Other implementations apply to system layer bit streams where each media packet always contains only one (1) payload. In still further implementations, the size of the “payload header” in the media packet is zero, which is likely if each media packet only contains a single payload, but could also happen when there are multiple payloads where the media packet header contains information about the size of each payload.

FIGS. 1-2 depict exemplary implementations in which the system layer bit streams include a series of Advanced Systems Format (ASF) packets each having data therein. These data are packetized into RTP packets consistent with, yet enhancing, the RFC 1889 standard. As such, the system layer bit streams include a series of media packets that are ASF packets, and the payload in each ASF packet is an ASF payload. While ASF packets are being used for illustration, the creation of RTP packets, in other implementations disclosed herein, is not limited to the use of ASF format data but may rather use other formats in which data to be streamed is stored. These other formats, as well as the ASF format, are generally described herein as system layer bit streams that include a plurality of media packets each having data therein, where these data are mapped to RTP in various implementations.

ASF Streaming Audio-Visual (AV) data 100 is depicted in FIG. 1. The ASF Streaming AV data 100, which includes audio data 102 and video data 104, has been packetized into an ASF packet A 106 and an ASF packet B 108. ASF packet A 106 includes a first ASF header, an ASF payload header, audio data 102, a second ASF header, and a video data A fragment of video data 104. ASF packet B 108 includes an ASF header, an ASF payload header, and a video data B fragment of video data 104.

The ASF Streaming AV data 100 as expressed in ASF packet A 106 and ASF packet B 108, in one implementation, can be packetized into a plurality of RTP packets. As seen in FIG. 1, these include RTP packet A 110, RTP packet 112(1) through RTP packet 112(N), and RTP packet D 116. Each RTP packet, in accordance with the RFC 1889 standard, has an RTP packet header, a payload, and an RTP payload format (PF) header. As used herein, the RTP PF header is a payload header in the RTP packet. Only one (1) type of media is in the RTP packet. Stated otherwise, the RTP packet does not contain mixed media payloads. In the implementation depicted in FIG. 1, video data A of ASF packet A 106 is too large to fit into a single RTP packet. As such, video data A of ASF packet A 106 is divided among RTP packet 112(1) through RTP packet 112(N). The RTP packet size can be a function of a physical characteristic of an underlying network over which the RTP packets are to be transmitted, or an administrative policy with respect to packet size such as can be made by the administrator of the underlying network, or an assessment of the transmission bandwidth of the underlying network.

Following the RTP packetization depicted in FIG. 1, audio data 102 is included in RTP packet A 110 and video data B of ASF packet B 108 is included in RTP packet D 116. Each RTP PF header of each RTP packet can contain information relating to the separation of the audio and video data into respectively separate RTP packets. Thus, A/V streaming sample data 124 can be reconstructed from the audio data in RTP packet A 110, video data A fragment 1 through video data A fragment N in respective RTP packets 112(1) through 112(N), and video data B in RTP packet D 116. Once the reconstruction of A/V streaming sample data 124 is complete, the audio sample data 120 and the video sample data A+B 122 therein can be rendered in a streaming context. Given the foregoing, FIG. 1 illustrates a wire format in which smaller RTP packets are created from larger ASF packets, where the packetization puts a payload of different data streams into separate packets each with its own RTP PF header. FIG. 1 also illustrates an implementation of a wire format in which block boundaries for each payload are preserved such that original audio and video samples that were encrypted and packetized in ASF packets can be reconstructed by a decryption mechanism that is performed upon the RTP packets.

ASF Streaming AV data 200 is depicted in FIG. 2. The ASF Streaming AV data 200, which includes video data 202, has been packetized into an ASF packet A 208 and an ASF packet B 210. ASF packet A 208 includes an ASF header, an ASF payload header, and video data A 204. ASF packet B 210 includes an ASF header, an ASF payload header, and video data B 206. FIG. 2 shows two (2) alternatives for packetizing ASF Streaming AV data 200 into RTP packets consistent with, yet enhancing, the RFC 1889 standard.

In the first alternative, following arrow 250, video data A 204 and video data B 206 are packetized into a single RTP packet alternative A 212 having an RTP header. Each of video data A 204 and video data B 206 is preceded by an RTP PF header. RTP packet alternative A 212, in accordance with the RFC 1889 standard, has an RTP header, multiple payloads, and respective RTP PF headers.

In the second alternative, also following arrow 250, video data A 204 and video data B 206, from respective ASF packets, are packetized into an RTP packet alternative B 214 having an RTP header. Video data A 204 and video data B 206 are assembled contiguously as the payload in RTP packet alternative B 214. The payload is preceded by an RTP PF header. RTP packet alternative B 214, in accordance with the RFC 1889 standard, has an RTP header, a payload, and one RTP PF header.

Following the RTP packetization depicted in FIG. 2, video data A and B (204, 206) are included in either RTP packet alternative A 212 or in RTP packet alternative B 214. Each RTP PF header can contain information relating to the corresponding payload. Each of the alternative RTP packets 212, 214 contains sufficient data to reconstruct ASF packet A 208 and ASF packet B 210 so as to obtain therein video data A and B (204, 206). Once the reconstruction is complete, the video sample data 222 can be rendered in a streaming context. Given the foregoing, FIG. 2 illustrates an RTP wire format in which larger RTP packets are created from small ASF packets, and where block boundaries for each payload are preserved such that original video samples that were encrypted and packetized in the two ASF packets can be reconstructed by a decryption mechanism that is performed upon the RTP packets.
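By way of a non-limiting sketch, the two alternatives of FIG. 2 could be told apart at a receiver by walking the payload headers found after the RTP header. The Python below is illustrative only and rests on several assumptions that are not stated in the figures: each RTP PF header is reduced to four bytes (an 8-bit flag field followed by a 24-bit Length/Offset field), the L flag occupies mask 0x20, none of the optional header fields are present, and a length covers the header plus its payload. The name split_payloads is hypothetical.

```python
from typing import List, Tuple

L_BIT = 0x20        # assumed mask: Length/Offset field holds a length
PF_HEADER_LEN = 4   # assumed: 8-bit flags + 24-bit Length/Offset, no optional fields


def split_payloads(body: bytes) -> List[Tuple[int, bytes]]:
    """Split the bytes that follow the RTP header into (flags, payload) pairs."""
    payloads = []
    pos = 0
    while pos < len(body):
        if len(body) - pos < PF_HEADER_LEN:
            raise ValueError("truncated payload header")
        flags = body[pos]
        field = int.from_bytes(body[pos + 1:pos + 4], "big")
        if flags & L_BIT:
            # Length counts the payload header plus the payload itself.
            end = pos + field
            if field < PF_HEADER_LEN or end > len(body):
                raise ValueError("length field does not fit the packet")
        else:
            # A fragment is assumed to run to the end of the packet.
            end = len(body)
        payloads.append((flags, body[pos + PF_HEADER_LEN:end]))
        pos = end
    return payloads


def pf(flags: int, payload: bytes) -> bytes:
    """Build one assumed four-byte header followed by its payload."""
    return bytes([flags]) + (PF_HEADER_LEN + len(payload)).to_bytes(3, "big") + payload


# Alternative A: each video payload keeps its own payload header.
alternative_a = pf(0x60, b"AAA") + pf(0x60, b"BB")
# Alternative B: both video payloads share one header as a combined payload.
alternative_b = pf(0x60, b"AAA" + b"BB")
```

Under these assumptions, alternative A yields two payloads and alternative B yields one combined payload.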

FIG. 3 a depicts a data structure layout for fields in an RTP header. The RTP header is more fully described in the RFC 1889 standard. The timestamp field in the RTP header should be set to the presentation time of the sample contained in the RTP packet. In one implementation, the clock frequency is 1 kHz unless specified to be different through means independent of RTP.
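As a simple illustration of the timestamp rule, the following Python sketch converts a presentation time into a value for the 32-bit RTP timestamp field. The function name rtp_timestamp and its parameters are hypothetical; the 1 kHz default follows the implementation described above, and the wrap at 2^32 simply reflects the width of the field.

```python
RTP_TIMESTAMP_MODULUS = 1 << 32  # the RTP timestamp field is 32 bits wide


def rtp_timestamp(presentation_seconds: float, clock_hz: int = 1000) -> int:
    """Convert a presentation time in seconds to RTP clock ticks."""
    if presentation_seconds < 0:
        raise ValueError("presentation time must be non-negative")
    ticks = round(presentation_seconds * clock_hz)
    # Wrap so the value always fits the 32-bit header field.
    return ticks % RTP_TIMESTAMP_MODULUS
```

A different clock frequency, signalled by means independent of RTP, would be passed as clock_hz.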

The 8th bit from the start of the RTP header (bit 8 when the bits are numbered from zero, as in the header diagram of the RFC 1889 standard) is interpreted as a marker (M) bit field. The M bit is normally set to zero, but is set to one (“1”) whenever the corresponding RTP packet has a payload that is not a fragment of a sample, contains the final fragment of a sample, or is one of a plurality of complete samples in the RTP packet. The M bit can be used by a receiver to detect the receipt of a complete sample for decoding and presenting. Thus, the M bit in the RTP header can be used to mark significant events in a packet stream (e.g., video sample frame boundaries).
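For illustration, a receiver might read the M bit and the timestamp from the twelve-byte fixed RTP header of the RFC 1889 standard as sketched below in Python. The function name parse_rtp_fixed_header and the dictionary keys are hypothetical; the field positions follow the RFC 1889 header layout, and any CSRC list or header extension is not handled in this sketch.

```python
import struct


def parse_rtp_fixed_header(packet: bytes) -> dict:
    """Parse the 12-byte fixed RTP header (RFC 1889) into named fields."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        # The marker bit is the most significant bit of the second byte.
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Example packet: version 2, M bit set, payload type 96, timestamp 1500.
example = bytes([0x80, 0x80 | 96]) + struct.pack("!HII", 7, 1500, 0x1234)
header = parse_rtp_fixed_header(example)
sample_complete = header["marker"]
```

A receiver could use sample_complete to decide when a sample is ready for decoding and presenting.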

FIG. 3 b depicts one implementation of an RTP payload format (PF) header or payload header. The RTP PF header has a sixteen (16) bit fixed length portion followed by a variable length portion. The fields of the RTP PF header depicted in FIG. 3 b include an 8-bit string indicated by the character fields “SGLRTDXZ”, a length/offset field, a relative timestamp field, a decompression time field, a duration field, and a Payload Extension (P.E.) length field and a corresponding P.E. data field, each of which is explained below.

The S field is one (1) bit in length and is set to one (“1”) if the corresponding payload (e.g., sample, fragment of a sample, or combination of samples) is a key sample, i.e. intracoded sample or I-Frame. Otherwise it is set to zero. The S-bit in all RTP PF headers preceding fragments of the same sample must be set to the same value.

The G field is one (1) bit in length and is used to group sub-samples in a corresponding payload that make up a single sample. Windows® Media Digital Rights Management (WM DRM) encrypts content based on the “ASF Payload” boundaries. In order to allow this content to be correctly decrypted, the boundaries of the sub-samples in the payload can be communicated to the client that is to receive the payload. For instance, an encryption unit can be packetized such that it is broken into a plurality of transmission units (e.g., placed within separate packets) that are to be transmitted. Before the broken plurality of transmission units can be decrypted at a receiving client they have to be reassembled into the original encrypted form. As in other decryption methodologies and mechanisms, the client can use the boundaries to properly reconstruct the encrypted encryption units in preparation for decryption of the encrypted content. As such, each “ASF Payload” should be preceded by this RTP PF header.

The G field bit should be set to zero (“0”) to indicate that an encrypted “unit” has been fragmented. If ASF is being used, the encryption unit will be an ASF payload and the bit is set to zero (“0”) on all fragmented ASF payloads, except the last ASF payload. In this case, whether or not a sample has been fragmented doesn't matter. If ASF is not being used, the encryption unit is a media sample, in which case the G bit is set to zero (“0”) on all fragmented media samples except the last sample. As to this latter case, the concern about whether or not an ASF payload has been fragmented is not applicable, since ASF is not used.

The L field is one (1) bit in length and is set to one (“1”) if the Length/Offset field contains a length. Otherwise it is set to zero (“0”) and the Length/Offset field contains an offset. The L-bit must be set to one (“1”) in all RTP PF headers preceding a complete (unfragmented) sample in the corresponding payload and must be set to zero in all RTP PF headers that precede a payload containing a fragmented sample.

The R field is one (1) bit in length and is set to one (“1”) if the RTP PF header contains a relative timestamp. Otherwise it is set to zero. The R-bit in all headers preceding fragments of the same sample must be set to the same value.

The T field is one (1) bit in length and is set to one (“1”) if the RTP PF header contains a decompression time. Otherwise it is set to zero. The T-bit in all RTP PF headers that precede a payload that contains a fragment of the same sample must be set to the same value.

The D field is one (1) bit in length and is set to one (“1”) if the RTP PF header contains a sample duration. Otherwise it is set to zero. The D-bit in all RTP PF headers that precede a payload containing fragments of the same sample must be set to the same value.

The X field is one (1) bit in length and is for optional or unspecified use. A transmitter of an RTP packet should set this bit to zero and a receiver thereof can ignore this bit.

The Z field is one (1) bit in length and is set to one (“1”) if the RTP PF header contains Payload Extension (P.E.) data, which can be metadata regarding the corresponding payload. Otherwise the Z field is set to zero. The Z field bit could be zero for all RTP PF headers whose M-bit is zero, but it should be set for all RTP PF headers whose M-bit is set to one (“1”) if the corresponding payload has P.E. data associated with it.
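The eight single-bit fields described above can be illustrated with the following Python sketch. It assumes, without support from FIG. 3 b beyond the order of the characters “SGLRTDXZ”, that S is the most significant bit of the flag byte and Z the least significant. The names decode_flags and fragments_consistent are hypothetical; the second function checks the rule that the S, R, T and D bits match in all headers preceding fragments of the same sample.

```python
from typing import Dict, Iterable

FLAG_NAMES = "SGLRTDXZ"  # assumed order: S is the most significant bit


def decode_flags(flag_byte: int) -> Dict[str, bool]:
    """Map the 8-bit flag field to a dictionary of named single-bit fields."""
    if not 0 <= flag_byte <= 0xFF:
        raise ValueError("flag field must fit in 8 bits")
    return {name: bool(flag_byte & (0x80 >> i)) for i, name in enumerate(FLAG_NAMES)}


def fragments_consistent(flag_bytes: Iterable[int]) -> bool:
    """Return True if S, R, T and D agree across all fragments of one sample."""
    seen = {tuple(decode_flags(b)[n] for n in "SRTD") for b in flag_bytes}
    return len(seen) <= 1


flags = decode_flags(0xA1)  # S, L and Z set under the assumed bit order
```

The G and L bits are deliberately left out of the consistency check, since they can differ between the fragments of a sample.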

The Length/Offset field is twenty four (24) bits in length and quantifies either a length or an offset. If a single sample has been fragmented over multiple RTP packets, the L-bit is set to zero and the Length/Offset field contains the byte offset of the first byte of this fragment from the beginning of the corresponding payload (e.g., sample or fragment thereof). If one or more complete samples are contained in the RTP packet, the L-bit is set to one (“1”) in each RTP PF header, and the Length/Offset field contains the length of the sample (including the RTP PF header).

The Relative Timestamp field is thirty-two (32) bits in length and is present only if the R-bit is set to one (“1”). It contains the relative timestamp for the corresponding sample with respect to the timestamp in the corresponding RTP header. The timescale used is the same as that used for the timestamp in the RTP header. The Relative Timestamp field is specified as a signed 32-bit number to allow for negative offsets from the timestamp of the RTP header. When the Relative Timestamp field is absent, a default relative timestamp of zero can be used.
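The arithmetic implied by a signed relative timestamp can be sketched as follows in Python. The helper names to_signed32 and sample_timestamp are hypothetical, and the wrap at 2^32 is an assumption that follows from the 32-bit width of the RTP timestamp field rather than from the description above.

```python
def to_signed32(value: int) -> int:
    """Interpret a raw 32-bit field as a signed two's-complement number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def sample_timestamp(rtp_timestamp: int, relative_timestamp: int = 0) -> int:
    """Apply the signed offset to the RTP header timestamp.

    The default of zero matches the case where the R-bit is not set.
    """
    return (rtp_timestamp + relative_timestamp) % (1 << 32)


# A raw field of 0xFFFFFFD8 denotes an offset of -40 ticks.
offset = to_signed32(0xFFFFFFD8)
timestamp = sample_timestamp(1000, offset)
```

With the 1 kHz clock of one implementation, the example places the sample 40 milliseconds before the time carried in the RTP header.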

The Decompression Time is thirty-two (32) bits in length and is present only if the T-bit is set to one (“1”). It contains the decompression time relative to the timestamp in the RTP header. The timescale used is the same as that used for the timestamp in the RTP header. This field is specified as a signed 32-bit number to allow for negative offsets from the timestamp in the RTP header.

The Duration field is thirty-two (32) bits in length and is present only if the D-bit is set to one (“1”). It contains the duration of the corresponding sample. The timescale used is the same as that used for the timestamp in the RTP header. The Duration field, in all RTP PF headers preceding fragments of the same sample, should be set to the same value. When this field is absent, the default duration is implicitly or explicitly obtained from the sample data. If this is not practical, the default is the difference between this sample's timestamp and the next sample's timestamp.

The Payload Extension (P.E.) Data Length field is sixteen (16) bits in length and is present only if the Z-bit is set to one (“1”). It contains the number of bytes of P.E. data contained after the fixed part of the RTP PF header. The P.E. data is variable in length and contains one or more attributes descriptive of the corresponding payload that it precedes. The P.E. data length field immediately follows the fixed part of the payload header and is followed by a number of bytes that contain the actual P.E. data. The structure of the P.E. data is communicated between the client and server (or peer to peer), such as via an SDP description. In one implementation for WM DRM protected content, there can be at least 4 bytes of P.E. data representing the WM DRM payload ID associated with every sample.
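The fields of the RTP PF header can be brought together in a parsing sketch such as the Python below. This is not a definitive layout. It assumes that the header begins with the 8-bit flag field (S as the most significant bit) followed by the 24-bit Length/Offset field, that the optional fields appear in the order in which they are listed for FIG. 3 b, that all fields are in network byte order, and that the Duration field is unsigned. The names PFHeader and parse_pf_header are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed masks, taking S as the most significant bit of "SGLRTDXZ".
R_BIT, T_BIT, D_BIT, Z_BIT = 0x10, 0x08, 0x04, 0x01


@dataclass
class PFHeader:
    flags: int
    length_or_offset: int
    relative_timestamp: Optional[int] = None
    decompression_time: Optional[int] = None
    duration: Optional[int] = None
    pe_data: bytes = b""


def parse_pf_header(buf: bytes, pos: int = 0) -> Tuple[PFHeader, int]:
    """Parse one payload header at pos; return it and the payload start index."""
    if len(buf) - pos < 4:
        raise ValueError("truncated payload header")
    hdr = PFHeader(buf[pos], int.from_bytes(buf[pos + 1:pos + 4], "big"))
    pos += 4
    try:
        if hdr.flags & R_BIT:  # signed offset from the RTP timestamp
            (hdr.relative_timestamp,) = struct.unpack_from("!i", buf, pos)
            pos += 4
        if hdr.flags & T_BIT:  # signed decompression time
            (hdr.decompression_time,) = struct.unpack_from("!i", buf, pos)
            pos += 4
        if hdr.flags & D_BIT:  # duration, assumed unsigned
            (hdr.duration,) = struct.unpack_from("!I", buf, pos)
            pos += 4
        if hdr.flags & Z_BIT:  # 16-bit P.E. length, then the P.E. data
            (pe_len,) = struct.unpack_from("!H", buf, pos)
            pos += 2
            if pos + pe_len > len(buf):
                raise ValueError("P.E. data runs past the end of the buffer")
            hdr.pe_data = buf[pos:pos + pe_len]
            pos += pe_len
    except struct.error as exc:
        raise ValueError("truncated optional field") from exc
    return hdr, pos


# Key sample, length 100, relative timestamp -5, four bytes of P.E. data.
raw = bytes([0xB1]) + (100).to_bytes(3, "big") + struct.pack("!i", -5)
raw += struct.pack("!H", 4) + b"\x00\x00\x00\x2a" + b"payload"
parsed, payload_start = parse_pf_header(raw)
```

In the example, the four bytes of P.E. data stand in for a WM DRM payload ID; the value shown is invented for illustration.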

While FIGS. 3 a-3 b show various fields in various orders for an RTP header and RTP PF header, not all fields are required and the order thereof can be rearranged. In some implementations, the required fields and order thereof may be consistent with, yet extend, the flexibility of the RFC 1889 standard. While ASF packets are being used for illustration of FIGS. 3 a-3 b, the creation of RTP packets, RTP PF headers and payloads therefor, in other implementations disclosed herein, is not limited to the use of ASF format data but may rather use other formats in which data to be streamed is stored.

General Network Structure

FIG. 4 shows a client/server network system 400 and environment in accordance with the invention. Generally, the system 400 includes one or more (m) network multimedia servers 402 and one or more (k) network clients 404. The computers communicate with each other over a data communications network, which in FIG. 4 includes a wired/wireless network 406. The data communications network 406 might also include the Internet or local-area networks and private wide-area networks. Servers 402 and clients 404 communicate with one another via any of a wide variety of known protocols, such as the Transmission Control Protocol (TCP) or User Datagram Protocol (UDP).

Multimedia servers/clients 402/404 have access to streaming media content in the form of different media streams. These media streams can be individual media streams (e.g., audio, video, graphical, simulation, etc.), or alternatively composite media streams including multiple such individual streams. Some media streams might be stored as files 408 in a database (e.g., ASF files) or other file storage system, while other media streams 410 might be supplied to the multimedia server 402 or client 404 on a “live” basis from other data source components through dedicated communications channels or through the Internet itself.

The media streams received from servers 402 or from clients 404 are rendered at the client 404 as a multimedia presentation, which can include media streams from one or more of the servers/clients 402/404. These different media streams can include one or more of the same or different types of media streams. For example, a multimedia presentation may include two video streams, one audio stream, and one stream of graphical images. A user interface (UI) at the client 404 can allow users various controls, such as allowing a user to either increase or decrease the speed at which the media presentation is rendered.

Exemplary Computer Environment

In the discussion below, the invention will be described in the general context of computer-executable instructions, such as program modules, being executed by one or more conventional personal computers. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. In a distributed computer environment, program modules may be located in both local and remote memory storage devices. Alternatively, the invention could be implemented in hardware or a combination of hardware, software, and/or firmware. For example, one or more application specific integrated circuits (ASICs) could be programmed to carry out the invention.

As shown in FIG. 4, a network system in accordance with the invention includes network server(s) and client(s) 402, 404 from which a plurality of media streams are available. In some cases, the media streams are actually stored by server(s) and/or client(s) 402, 404. In other cases, server(s) and/or client(s) 402, 404 can obtain the media streams from other network sources or devices. Generally, the network clients 404 are responsive to user input to request media streams corresponding to selected multimedia content. In response to a request for a media stream corresponding to multimedia content, server(s) and/or client(s) 402, 404 stream the requested media streams to the requesting network client 404 in accordance with an RTP wire format. The client 404 decrypts the payloads in the respective RTP packets and renders the resultant unencrypted data streams to produce the requested multimedia content.

FIG. 5 illustrates the input and storage of A/V streaming data on a server 402 or a client 404 (e.g., a peer). FIG. 5 also illustrates communications between server and client (402-404) or peer-to-peer (404-404) in accordance with various implementations. By way of overview, the server or client 402, 404 receives input of A/V streaming data from an input device 502. The server or client 402, 404 encodes the input using an encoder of a codec. The encoding can, but need not, be performed on ASF format data. If ASF format data is used, the encoding is performed upon ASF packets that each include an ASF header, an ASF payload header, and an AV (audio and/or video) payload. The encoding can include encryption, such as where WM DRM is used. The ASF packets are stored by the server/client 402, 404 for serving future requests for same.

Subsequently, the client requests the corresponding AV data stream from the server/client. The server/client retrieves and transmits to the client the corresponding AV stream that the server/client had previously stored. Upon receipt, the client decodes the AV data stream, and reconstructs and decrypts encrypted broken up AV data stream samples using boundaries communicated in the corresponding RTP PF headers. The client can then perform rendering of the streamed AV data.

The flow of data is seen in FIG. 5 between and among blocks 504-530. At block 504, an input device 502 furnishes to server/client 402/404 input that includes A/V streaming data. By way of example, the A/V streaming data might be supplied to server/client 402/404 on a “live” basis by input device 502 through dedicated communications channels or through the Internet. The A/V streaming data is supplied to an encoder at block 504 for placing the data into ASF packets. At block 506, optional WM DRM encryption is employed and the ASF packets are stored at the server/client 402/404. A result of the WM DRM encryption and packetization can be that an encryption unit is broken into a plurality of separate packets. Before the broken plurality of transmission units can be decrypted at a receiving client they have to be reassembled at the client into the original encryption units. As such, the boundaries of the broken transmission units are stored in the ASF payload headers at block 506.

At block 508, client 404 makes a request for the A/V data stream that is transmitted to server/client 402/404 as seen at arrow 510 in FIG. 5. At block 512, server/client 402/404 receives the request. The corresponding ASF packets that contain the requested A/V data stream are retrieved. At block 514, audio and video payloads in the ASF packets are logically separated so that they can be separately packetized into RTP packets. Boundaries for each logically separate audio and video payload are identified.

A bandwidth of the network over which RTP packets are to be transmitted is determined. This determination is used to derive a predetermined RTP packet size. Where the ASF packet size is smaller than the predetermined RTP packet size, like-kind payloads can be combined into a single RTP packet. Where the ASF packet size is bigger than the predetermined RTP packet size, ASF payloads can be fragmented for placement as a payload into a single RTP packet. Boundaries for each RTP payload are determined using the corresponding logically separate audio and video payloads of the ASF packets.
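The fragmentation case can be sketched in Python as follows. The sketch assumes the flag masks G = 0x40 and L = 0x20 (S taken as the most significant bit of the flag field) and a four-byte payload header with no optional fields when computing a length. The function name packetize_unit and its parameters are hypothetical. Consistent with the description of the G and L fields, a fragmented encryption unit carries byte offsets with the G bit cleared on every fragment except the last, while an unfragmented unit carries a length.

```python
from typing import List, Tuple

G_BIT = 0x40  # assumed mask: cleared on every fragment except the last
L_BIT = 0x20  # assumed mask: Length/Offset field holds a length


def packetize_unit(unit: bytes, max_payload: int,
                   header_len: int = 4) -> List[Tuple[int, int, bytes]]:
    """Return (flags, length_or_offset, payload) tuples for one encryption unit."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    if len(unit) <= max_payload:
        # Unfragmented: the length includes the payload header itself.
        return [(G_BIT | L_BIT, len(unit) + header_len, unit)]
    fragments = []
    for offset in range(0, len(unit), max_payload):
        chunk = unit[offset:offset + max_payload]
        is_last = offset + len(chunk) == len(unit)
        fragments.append((G_BIT if is_last else 0, offset, chunk))
    return fragments


pieces = packetize_unit(b"abcdefghij", max_payload=4)
whole = packetize_unit(b"abc", max_payload=4)
```

Here max_payload stands for the room left for payload data once the predetermined RTP packet size has been reduced by the RTP header and RTP PF header.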

At step 516, the RTP header, RTP PF header, and respective payload are assembled for each RTP packet. As such, a plurality of RTP packets have been formed that represent a plurality of ASF packets, where the ASF packets contain the A/V data stream that was requested by client 404. The RTP packets are streamed for rendering at client 404 from server/client 402/404 via a transmission function at block 518.

An arrow 520 in FIG. 5 shows transmission of the RTP packets from server/client 402/404 to client 404. At block 522, client 404 receives the RTP packets. At block 524, an RTP decoder at client 404 decodes each received RTP packet, including the RTP header and the RTP PF header. At block 526, a process performs defragmentation and reconstruction of the ASF packets containing the requested A/V data stream. The defragmentation and reconstruction uses boundaries set forth in the RTP PF header for each corresponding payload containing, for instance, a sample or fragment thereof.
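The defragmentation at block 526 might be sketched as follows, assuming each received payload carries a unit index and a byte offset as its boundary information. These two fields and the function name `reassemble` are assumptions for this sketch; the actual boundary fields are those of the RTP PF header.

```python
def reassemble(entries):
    """Rebuild whole units from (unit_index, offset, chunk) tuples.

    Entries may arrive out of order. A gap or overlap between fragments
    of the same unit raises ValueError rather than yielding corrupt data.
    """
    units = {}
    for index, offset, chunk in entries:
        units.setdefault(index, []).append((offset, chunk))

    result = []
    for index in sorted(units):
        data = b""
        for offset, chunk in sorted(units[index]):
            if offset != len(data):
                raise ValueError(f"unit {index}: fragment at {offset} does not "
                                 f"follow {len(data)} bytes already assembled")
            data += chunk
        result.append(data)
    return result


# Fragments of unit 1 arrive before unit 0 and out of order.
units = reassemble([(1, 4, b"cc"), (0, 0, b"aa"), (1, 0, b"cccc")])
```

Only once a unit is contiguous again can it be handed to decryption, which is why the boundaries must survive the trip in the payload format header.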

At block 528, the reconstructed ASF packets are decrypted for rendering at block 530. The RTP PF header in an RTP packet may contain Payload Extension (P.E.) data that is descriptive of the corresponding payload. The P.E. data can thus provide metadata that can be used during a rendering of the payload in the corresponding RTP packet at block 530. The blocks 522-530 are repeated for each RTP packet that is received at client 404, thereby accomplishing the streaming of the A/V data from server/client 402/404 for rendering.

FIG. 6 shows a general example of a computer 642 that can be used in accordance with the invention. Computer 642 is shown as an example of a computer that can perform the functions of any of servers 402 or clients 404 of FIGS. 4-5. Computer 642 includes one or more processors or processing units 644, a system memory 646, and a system bus 648 that couples various system components including the system memory 646 to processors 644.

The bus 648 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. The system memory includes read only memory (ROM) 650 and random access memory (RAM) 652. A cache 675 having levels L1, L2, and L3 may be included in RAM 652. A basic input/output system (BIOS) 654, containing the basic routines that help to transfer information between elements within computer 642, such as during start-up, is stored in ROM 650. Computer 642 further includes a hard disk drive 656 for reading from and writing to a hard disk (not shown), a magnetic disk drive 658 for reading from and writing to a removable magnetic disk 660, and an optical disk drive 662 for reading from or writing to a removable optical disk 664 such as a CD ROM or other optical media.

Any of the hard disk (not shown), magnetic disk drive 658, optical disk drive 662, or removable optical disk 664 can be an information medium having recorded information thereon. The information medium has a data area for recording stream data using stream packets each of which includes a packet area containing one or more data packets. By way of example, each data packet is encoded and decoded by a Codec of application programs 672 executing in processing unit 644. As such, the encoder distributes the stream data to the data packet areas in the stream packets so that the distributed stream data are recorded in the data packet areas using an encoding algorithm. Alternatively, encoding and decoding of data packets can be performed as a function of operating system 670 executing on processing unit 644.

The hard disk drive 656, magnetic disk drive 658, and optical disk drive 662 are connected to the system bus 648 by an SCSI interface 666 or some other appropriate interface. The drives and their associated computer-readable media provide nonvolatile storage of computer readable instructions, data structures, program modules and other data for computer 642. Although the exemplary environment described herein employs a hard disk, a removable magnetic disk 660 and a removable optical disk 664, it should be appreciated by those skilled in the art that other types of computer readable media which can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, random access memories (RAMs), read only memories (ROMs), and the like, may also be used in the exemplary operating environment.

A number of program modules may be stored on the hard disk, magnetic disk 660, optical disk 664, ROM 650, or RAM 652, including an operating system 670, one or more application programs 672 (which may include the Codec), other program modules 674, and program data 676. A user may enter commands and information into computer 642 through input devices such as keyboard 678 and pointing device 680. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are connected to the processing unit 644 through an interface 682 that is coupled to the system bus. A monitor 684 or other type of display device is also connected to the system bus 648 via an interface, such as a video adapter 686. In addition to the monitor, personal computers typically include other peripheral output devices (not shown) such as speakers and printers.

Computer 642 operates in a networked environment using logical connections to one or more remote computers, such as a remote computer 688. The remote computer 688 may be another personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to computer 642, although only a memory storage device 690 has been illustrated in FIG. 6. The logical connections depicted in FIG. 6 include a local area network (LAN) 692 and a wide area network (WAN) 694. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet. In the described embodiment of the invention, remote computer 688 executes an Internet Web browser program such as the Internet Explorer® Web browser manufactured and distributed by Microsoft Corporation of Redmond, Wash.

When used in a LAN networking environment, computer 642 is connected to the local network 692 through a network interface or adapter 696. When used in a WAN networking environment, computer 642 typically includes a modem 698 or other means for establishing communications over the wide area network 694, such as the Internet. The modem 698, which may be internal or external, is connected to the system bus 648 via a serial port interface 668. In a networked environment, program modules depicted relative to the personal computer 642, or portions thereof, may be stored in the remote memory storage device. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.

Generally, the data processors of computer 642 are programmed by means of instructions stored at different times in the various computer-readable storage media of the computer. Programs and operating systems are typically distributed, for example, on floppy disks or CD-ROMs. From there, they are installed or loaded into the secondary memory of a computer. At execution, they are loaded at least partially into the computer's primary electronic memory. The invention described herein includes these and other various types of computer-readable storage media when such media contain instructions or programs for implementing the steps described below in conjunction with a microprocessor or other data processor. The invention also includes the computer itself when programmed according to the methods and techniques described below. Furthermore, certain sub-components of the computer may be programmed to perform the functions and steps described below. The invention includes such sub-components when they are programmed as described. In addition, the invention described herein includes data structures, described below, as embodied on various types of memory media.

For purposes of illustration, programs and other executable program components such as the operating system are illustrated herein as discrete blocks, although it is recognized that such programs and components reside at various times in different storage components of the computer, and are executed by the data processor(s) of the computer.

Conclusion

Implementations disclosed herein define a wire format that can be used in delivery of multimedia data between server and client and peer to peer via RTP. The wire format allows for greater flexibility than the currently adopted IETF RFC 1889 standards for RTP delivery. Implementations of the wire format provide for streaming of encrypted data, provide a mechanism for delivering per sample metadata via RTP, and provide for streaming of data that is protected with WM DRM.

Although the invention has been described in language specific to structural features and/or methodological acts, it is to be understood that the invention defined in the appended claims is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the claimed invention.

1. An apparatus comprising: means for encrypting a data stream with an arbitrary block size to form a plurality of encryption units; and means for packetizing the plurality of encryption units into a plurality of RTP packets each including: an RTP packet header; one or more payloads of a common data stream and selected from the group consisting of: one or more said encryption units; and a fragment of one said encryption unit; and one RTP payload format header for each said payload and including, for the corresponding encryption units, a boundary for the arbitrary block size.
 2. The apparatus as defined in claim 1, further comprising: means for reassembling the plurality of encryption units using: the payloads in the plurality of RTP packets; and the respective boundary for the arbitrary block size in the respective RTP payload format header; means for decrypting the plurality of encryption units to form the data stream.
 3. The apparatus as defined in claim 2, wherein: each said RTP payload format header further comprises one or more attributes of the corresponding payload; and the apparatus further comprises means for rendering the formed data stream using the attributes of the corresponding payload.
 4. The apparatus as defined in claim 2, wherein the attributes in each said RTP payload format header are selected from the group consisting of: timing information; and video compression frame information.
 5. The apparatus as defined in claim 2, further comprising means for transmitting the plurality of RTP packets over a network.
 6. An apparatus comprising: means for logically separating media data type in a data stream including a plurality of said media data types; and means for forming a plurality of RTP packets from the data stream, each said RTP packet including: only one said media data type; an RTP packet header; one or more variable length RTP payload format headers each having one or more attributes; and an RTP payload corresponding to each said RTP payload format header and being described by the one or more attributes therein.
 7. The apparatus as defined in claim 6, further comprising: means for extracting the payloads from the plurality of RTP packets; and means for rendering each payload in the plurality of RTP packets using the one or more attributes in the corresponding RTP payload format header.
 8. The apparatus as defined in claim 7, wherein: each said payload comprises video data; and the attributes in each said RTP payload format header are selected from the group consisting of: timing information; and video compression frame information.
 9. The apparatus as defined in claim 7, wherein the means for extracting further comprises, for each said RTP payload: means, where the RTP payload includes a plurality of portions of one of the media data types, for assembling the plurality of portions of one of the media data types into a contiguous payload; means, where the RTP payload includes one portion of one of the media data types, for assembling the one portion of one of the media data types into a contiguous payload; and means, where the RTP payload includes a fragment of one portion of one of the media data types, for assembling all of the fragments of the one portion of one of the media data types into a contiguous payload.
 10. The apparatus as defined in claim 9, further comprising: means for assembling, in respective chronological order corresponding to the plurality of media data types of the media file, the contiguous payloads; and means for simultaneously rendering the chronologically ordered contiguous payloads of the plurality of media data types of the media file.
 11. A data structure having a wire format for transmission over a network, the data structure comprising a plurality of single media packets formed from a plurality of mixed media packets, wherein: each mixed media packet includes: a payload for each of a plurality of data streams, wherein the payload is encrypted and has an arbitrary block size; and a payload header for each payload and including a boundary for the arbitrary block size; each single media packet includes one data stream, corresponds to one of the mixed media packets, and includes: one payload corresponding to one of the payloads in the one mixed media packet; a payload profile format header corresponding to: the one payload; and one or more payload headers of the one mixed media packet, wherein the payload profile format header has a boundary corresponding to: the respective boundaries of the one or more payload headers of the one mixed media packet; and the one payload.
 12. The data structure of claim 11, wherein each single media packet further comprises: a packet header corresponding to one or more packet headers of the plurality of mixed media packets; a composition selected from the group consisting of: a plurality of the payloads of the mixed media packets, being of like data stream, each having a corresponding said payload profile format header; and one said payload and a corresponding said payload profile format header.
 13. The data structure of claim 11, wherein each single media packet is less than a predetermined size that is a function selected from the group consisting of: a physical characteristic of an underlying network; an administrative policy with respect to packet size; and an assessment of the transmission bandwidth of the underlying network.
 14. The data structure of claim 11, wherein the payload boundary in the single media packet identifies the chronological order of the corresponding payload in the one mixed media packet.
 15. The data structure of claim 11, wherein the one said data stream is selected from the group consisting of audio data, video data, program data, JPEG data, HTML data, and MIDI data.
 16. The data structure of claim 11, wherein: the payload profile format header includes a fixed length portion and a variable length portion; and the variable length portion includes attributes of the corresponding payload.
 17. The data structure of claim 11, wherein: each said mixed media packet includes a portion of an ASF data stream, an ASF packet header, and at least one ASF payload header; and each said single media packet includes an RTP packet header, one RTP payload format header, and a portion of an RTP data stream.
 18. A method comprising: encrypting a data stream with an arbitrary block size to form a plurality of encryption units; and packetizing the plurality of encryption units into a plurality of RTP packets each including: an RTP packet header; one or more payloads of a common data stream and selected from the group consisting of: one or more said encryption units; and a fragment of one said encryption unit; one RTP payload format header for each said payload and including, for the corresponding encryption units, a boundary for the arbitrary block size.
 19. The method as defined in claim 18, further comprising: reassembling the plurality of encryption units using: the payloads in the plurality of RTP packets; and the respective boundary for the arbitrary block size in the respective RTP payload format header; decrypting the plurality of encryption units to form the data stream.
 20. The method as defined in claim 19, wherein: each said RTP payload format header further comprises one or more attributes of the corresponding payload; and the method further comprises rendering the formed data stream using the attributes of the corresponding payload.
 21. The method as defined in claim 19, wherein the attributes in each said RTP payload format header are selected from the group consisting of: timing information; and video compression frame information.
 22. The method as defined in claim 19, further comprising, prior to the reassembling, transmitting the plurality of RTP packets over a network to a client at which the reassembling is performed.
 23. A computer readable medium comprising machine readable instructions that, when executed, perform the method of claim 18.
 24. A method comprising forming a plurality of RTP packets from a data stream including a plurality of media data types, each said RTP packet including: only one said media data type; an RTP packet header; one or more variable length RTP payload format headers each having one or more attributes; and an RTP payload corresponding to each said RTP payload format header and being described by the one or more attributes therein.
 25. The method as defined in claim 24, further comprising: extracting the payloads from the plurality of RTP packets; and rendering each payload in the plurality of RTP packets using the one or more attributes in the corresponding RTP payload format header.
 26. The method as defined in claim 25, wherein the attributes in each said RTP payload format header are selected from the group consisting of: timing information; and video compression frame information.
 27. The method as defined in claim 25, wherein the extracting the payloads from the plurality of RTP packets further comprises, for each said RTP payload: that includes a plurality of portions of one of the media data types, assembling the plurality of portions of one of the media data types into a contiguous payload; that includes one portion of one of the media data types, assembling the one portion of one of the media data types into a contiguous payload; and that includes a fragment of one portion of one of the media data types, assembling all of the fragments of the one portion of one of the media data types into a contiguous payload.
 28. The method as defined in claim 27, further comprising: assembling, in respective chronological order corresponding to the plurality of media data types of the media file, the contiguous payloads; and simultaneously rendering the chronologically ordered contiguous payloads of the plurality of media data types of the media file.
 29. A computer readable medium comprising machine readable instructions that, when executed, perform the method of claim 25.
 30. A method comprising changing a plurality of mixed media packets into a plurality of single media packets, wherein: each mixed media packet includes: a payload for each of a plurality of data streams, wherein the payload is encrypted and has an arbitrary block size; a payload header for each payload and including a boundary for the arbitrary block size; each single media packet includes one data stream, corresponds to one of the mixed media packets, and includes: one payload corresponding to one of the payloads in the one mixed media packet; a payload profile format header corresponding to: the one payload; and one or more payload headers of the one mixed media packet, wherein the payload profile format header has a boundary corresponding to: the respective boundaries of the one or more payload headers of the one mixed media packet; and the one payload.
 31. The method of claim 30, wherein each single media packet further comprises: a packet header corresponding to one or more packet headers of the plurality of mixed media packets; a composition selected from the group consisting of: a plurality of the payloads of the mixed media packets, being of like data stream, each having a corresponding said payload profile format header; and one said payload and a corresponding said payload profile format header.
 32. The method of claim 30, wherein each single media packet is less than a predetermined size that is a function selected from the group consisting of: a physical characteristic of an underlying network; an administrative policy with respect to packet size; and an assessment of the transmission bandwidth of a network.
 33. The method of claim 30, wherein the payload boundary in the single media packet identifies the chronological order of the corresponding payload in the one mixed media packet.
 34. The method of claim 30, wherein the one said data stream is selected from the group consisting of audio data, video data, program data, JPEG data, HTML data, and MIDI data.
 35. The method of claim 30, wherein: the payload profile format header includes a fixed length portion and a variable length portion; and the variable length portion includes attributes of the corresponding payload.
 36. The method of claim 30, wherein: each said mixed media packet includes a portion of an ASF data stream, an ASF packet header, and at least one ASF payload header; and each said single media packet includes an RTP packet header, one RTP payload format header, and a portion of an RTP data stream.
 37. A computer readable medium comprising machine readable instructions that, when executed, perform the method of claim 30.
 38. A method comprising changing a plurality of mixed media packets into a plurality of single media packets, wherein: each mixed media packet includes: a payload for each of a plurality of data streams, wherein the payload is encrypted and has an arbitrary block size; a packet header; and a payload header for each payload and including a boundary for the arbitrary block size; each single media packet corresponds to one of the mixed media packets and includes: one payload corresponding to one of the payloads in the one mixed media packet; a packet header corresponding to one of the packet headers of the one mixed media packet; a payload profile format header corresponding to: the one payload; and one or more payload headers of the one mixed media packet; wherein the payload profile format header has a payload boundary corresponding to: the respective payload boundaries of the one or more payload headers of the one mixed media packet; and the one payload.
 39. The method of claim 38, wherein: each said mixed media packet includes a portion of an ASF data stream, an ASF packet header, and at least one ASF payload header; and each said single media packet includes an RTP packet header, one RTP payload format header, and a portion of an RTP data stream.
 40. The method of claim 38, wherein: the payload profile format header includes a fixed length portion and a variable length portion; and the variable length portion includes attributes of the corresponding payload.
 41. A computer readable medium comprising machine readable instructions that, when executed, perform the method of claim 38.
 42. A method comprising changing a plurality of single media packets into a composite packet, wherein: each single media packet includes: a payload of one data stream, wherein the payload is encrypted and has an arbitrary block size; a payload header for the payload and including a boundary for the arbitrary block size; the composite packet corresponds to the plurality of single media packets and includes: one or more payloads of a like data stream and corresponding to the respective payloads of the plurality of single media packets; and a payload profile format header for each said payload in the composite packet and corresponding to the payload headers of the plurality of single media packets, wherein the payload profile format header has a payload boundary for a respective said payload in the composite packet that identifies an order thereof in the plurality of single media packets.
 43. The method of claim 42, wherein the composite packet further comprises: a packet header corresponding to packet headers for each of the plurality of single media packets; a composition selected from the group consisting of: a plurality of said payloads each having a corresponding said payload profile format header; and one said payload and a corresponding said payload profile format header.
 44. The method of claim 42, wherein each single media packet is less than a predetermined size that is a function selected from the group consisting of: a physical characteristic of an underlying network; an administrative policy with respect to packet size; and an assessment of the transmission bandwidth of the underlying network.
 45. The method of claim 42, wherein the data stream is selected from the group consisting of audio data, video data, program data, JPEG data, HTML data, and MIDI data.
 46. The method of claim 42, wherein: each said mixed media packet includes a portion of an ASF data stream, an ASF packet header, and at least one ASF payload header; and each said single media packet includes an RTP packet header, one RTP payload format header, and a portion of an RTP data stream.
 47. The method of claim 42, wherein: the payload profile format header includes a fixed length portion and a variable length portion; and the variable length portion includes attributes of the corresponding payload.
 48. A computer readable medium comprising machine readable instructions that, when executed, perform the method of claim 42.
 49. A client computing device comprising a processor for executing logic configured to: send a request for a media file including a plurality of media data types; receive streaming media in a plurality of RTP packets corresponding to the media file and including: only one said media data type; an RTP packet header; one or more RTP payload format headers each including an RTP payload boundary; and an RTP payload for and corresponding to each said RTP payload format header, wherein the RTP payload is encrypted and has an arbitrary block size corresponding to the RTP payload boundary, each said RTP payload being selected from the group consisting of: a plurality of portions of one of the media data types; one portion of one of the media data types; and a fragment of one portion of one of the media data types; for each said RTP payload in the received RTP packets: that includes a plurality of portions of one of the media data types, assemble the plurality of portions of one of the media data types into a contiguous payload using the RTP payload boundary of the corresponding RTP payload format header; that includes one portion of one of the media data types, assemble the one portion of one of the media data types into a contiguous payload using the RTP payload boundary of the corresponding RTP payload format header; and that includes a fragment of one portion of one of the media data types, assemble all of the fragments of the one portion of one of the media data types into a contiguous payload using each said RTP payload boundary of the corresponding RTP payload format headers; assemble, in respective chronological order corresponding to the plurality of media data types of the media file, the contiguous payloads; and simultaneously render the chronologically ordered contiguous payloads of the plurality of media data types of the media file.
 50. The client computing device of claim 49, wherein the plurality of RTP packets are variable in size and less than a predetermined size that is a function selected from the group consisting of: an assessment of the transmission bandwidth of an underlying network from which the plurality of RTP packets was received; a physical characteristic of the underlying network; and an administrative policy with respect to packet size.
 51. The client computing device of claim 49, wherein each said RTP payload boundary identifies the chronological order of the corresponding RTP payload in the media data type of the media file.
 52. The client computing device of claim 49, wherein each said media data type is selected from the group consisting of audio data, video data, program data, JPEG data, HTML data, and MIDI data.
 53. The client computing device of claim 49, wherein: each said RTP payload format header includes a fixed length portion and a variable length portion; and the variable length portion includes attributes of the corresponding RTP payload.
 54. A client computing device comprising a processor for executing logic configured to: send a request for a media file including audio and video data; receive a plurality of RTP packets corresponding to a plurality of ASF packets for the media file, wherein: each said ASF packet includes: an ASF packet header; and one or more ASF payload headers each including an ASF payload boundary for a corresponding ASF payload, wherein the ASF payload is encrypted with an arbitrary block size corresponding to the ASF payload boundary; the ASF payload for and corresponding to each said ASF payload header is selected from the group consisting of: some of the audio data including an audio sample or fragment thereof; and some of the video data including a video sample or fragment thereof; each said RTP packet includes: either some of the audio data or some of the video data; an RTP packet header corresponding to at least one of the ASF packet headers; one or more RTP payload format headers corresponding to at least one of the ASF payload headers, wherein each said RTP payload format header includes an RTP payload boundary corresponding to at least one of the ASF payload boundaries; and an RTP payload for and corresponding to each said RTP payload format header, each said RTP payload being selected from the group consisting of: a plurality of the ASF payloads; one of the ASF payloads; and a fragment of one of the ASF payloads; for each said RTP payload in the received RTP packets: that includes a plurality of the ASF payloads, assemble the plurality of the ASF payloads into a contiguous payload using the RTP payload boundary of the corresponding RTP payload format header; that includes one of the ASF payloads, assemble the one said ASF payload into a contiguous payload using the RTP payload boundary of the corresponding RTP payload format header; and that includes a fragment of one of the ASF payloads, assemble all of the fragments of the one of the ASF payloads into a contiguous payload using each said RTP payload boundary of the corresponding RTP payload format headers; assemble, in respective chronological order corresponding to the audio and video data of the media file, the contiguous payloads; and simultaneously render the chronologically ordered contiguous payloads of both the audio data of the media file and the video data of the media file.
 55. The client computing device of claim 54, wherein the RTP packets are variable in size and less than a predetermined size that is a function of one selection from the group consisting of: an assessment of the transmission bandwidth of an underlying network from which the plurality of RTP packets was received; a physical characteristic of the underlying network; an administrative policy with respect to packet size; the size of the ASF packets that correspond to the received plurality of RTP packets; and a combination of the foregoing.
 56. The client computing device of claim 54, wherein each said ASF payload boundary identifies the respective chronological order of the corresponding ASF payload in one of: the audio data in the media file; and the video data in the media file.
 57. The client computing device of claim 54, wherein each said RTP payload boundary identifies the respective chronological order of the corresponding RTP payload in one of: the audio data in the media file; and the video data in the media file.
 58. The client computing device of claim 54, wherein each said RTP payload boundary identifies the respective chronological order of the corresponding RTP payload in one of: a plurality of the ASF payloads; and a fragment of one of the ASF payloads.
 59. The client computing device of claim 54, wherein: each said RTP payload format header includes a fixed length portion and a variable length portion; and the variable length portion includes attributes of the corresponding RTP payload.